Robust and Fast Lyric Search based on Phonetic Confusion Matrix
Abstract
This paper proposes a robust and fast lyric search method for music information retrieval. The accuracy of current lyric search systems built on standard text retrieval techniques deteriorates severely when a lyric-phrase query contains errors caused by mishearing or misremembering. To solve this problem, the authors incorporate an acoustic distance, computed from the confusion matrix of an automatic speech recognition (ASR) experiment, into phonetic string matching based on dynamic programming (DP). Experimental results show that search accuracy improves by more than 40% over the standard text retrieval method and by 2% to 4% over the conventional phonetic string matching method. Because DP matching has high computational complexity, the authors also propose a novel two-pass search strategy to shorten processing time. The first pass pre-selects probable candidates with a rapid index-based search, and the second pass runs the DP-based search only among those candidates. The proposed method reduces processing time by 85.8% while keeping search accuracy at the same level as a complete DP-matching search over all lyrics.
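The two ideas in the abstract, confusion-weighted DP matching and a two-pass search, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cost formula (one minus the larger of the two confusion probabilities), the fixed insertion/deletion cost, the phoneme-bigram index, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter, defaultdict


def substitution_costs(confusions):
    """Turn confusion counts {(ref, hyp): n} into a cost function in [0, 1].

    Assumed formula (not from the paper): 1 - max(P(b|a), P(a|b)).
    """
    totals = Counter()
    for (ref, _hyp), n in confusions.items():
        totals[ref] += n
    prob = {pair: n / totals[pair[0]] for pair, n in confusions.items()}

    def cost(a, b):
        if a == b:
            return 0.0
        # Phonemes that are often confused get a lower substitution cost.
        return 1.0 - max(prob.get((a, b), 0.0), prob.get((b, a), 0.0))

    return cost


def dp_distance(query, lyric, cost, indel=1.0):
    """Lowest-cost alignment of the query against any substring of the lyric."""
    prev = [0.0] * (len(lyric) + 1)  # the match may start anywhere for free
    for q in query:
        cur = [prev[0] + indel]
        for j, p in enumerate(lyric, 1):
            cur.append(min(prev[j - 1] + cost(q, p),  # substitute or match
                           prev[j] + indel,           # query phoneme unmatched
                           cur[j - 1] + indel))       # lyric phoneme skipped
        prev = cur
    return min(prev)  # the match may end anywhere


def bigrams(seq):
    return set(zip(seq, seq[1:]))


def build_index(lyrics):
    """Inverted index from phoneme bigram to the songs that contain it."""
    index = defaultdict(set)
    for song_id, phones in lyrics.items():
        for gram in bigrams(phones):
            index[gram].add(song_id)
    return index


def two_pass_search(query, lyrics, index, cost, top_k=2):
    # First pass: cheap index lookup, vote for songs sharing bigrams.
    votes = Counter()
    for gram in bigrams(query):
        for song_id in index.get(gram, ()):
            votes[song_id] += 1
    candidates = [song_id for song_id, _ in votes.most_common(top_k)]
    # Second pass: DP matching only on the pre-selected candidates.
    return sorted(candidates,
                  key=lambda s: dp_distance(query, lyrics[s], cost))
```

In this sketch the first pass costs time proportional to the number of query bigrams rather than to the total lyric length, which is the kind of saving the two-pass strategy relies on. The trade-off is that a correct song missing from the top `top_k` candidates can never be recovered by the second pass.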
Related resources
Solving Misheard Lyric Search Queries Using a Probabilistic Model of Speech Sounds
Music listeners often mishear the lyrics to unfamiliar songs heard from public sources, such as the radio. Since standard text search engines will find few relevant results when they are entered as a query, these misheard lyrics require phonetic pattern matching techniques to identify the song. We introduce a probabilistic model of mishearing trained on examples of actual misheard lyrics, and d...
A fast fuzzy keyword spotting algorithm based on syllable confusion network
This paper presents a fast fuzzy search algorithm to extract keyword candidates from syllable confusion networks (SCNs) in Mandarin spontaneous speech. Since the recognition accuracy of spontaneous speech is quite poor, syllable confusion matrix (SCM) is applied to compensate for the recognition errors and to improve recall. For fast retrieval, an efficient vocabulary-independent index structur...
A fuzzy acoustic-phonetic decoder for speech recognition
In this paper, a general framework of acoustic-phonetic modelling is developed. Context-sensitive rules are incorporated into a knowledge-based automatic speech recognition (ASR) system and are assessed with control based on fuzzy decision making. The reliability measure is outlined: a test collection is run and a confusion matrix is built for each rule. During the recognition procedure the fu...
Phonetic Representation-Based Speech Translation
This paper explores a tight coupling of Automatic Speech Recognition (ASR) and Machine Translation (MT) for speech translation with information sharing on the phone level. Our novel approach allows MT to access fine-grained phonetic information from ASR, as a methodology for facilitating speech translation. Specifically, Phrase-based Statistical MT (PBSMT) models are adapted to work on source la...
A Robust Adaptive Observer-Based Time Varying Fault Estimation
This paper presents a new observer design methodology for time-varying actuator fault estimation. A new linear matrix inequality (LMI) design algorithm is developed to tackle the limitations (e.g. equality constraint and robustness problems) of the well-known, so-called fast adaptive fault estimation observer (FAFE). The FAFE is capable of estimating a wide range of time-varying actuator fault...
Publication date: 2009